Expressing an Image Stream with a Sequence of Natural Sentences
Authors
Abstract
We propose an approach for retrieving a sequence of natural sentences for an image stream. Since general users often take a series of pictures of their special moments, it is better to consider the whole image stream when producing natural language descriptions. While almost all previous studies have dealt with the relation between a single image and a single natural sentence, our work extends both the input and output dimensions to a sequence of images and a sequence of sentences. To this end, we design a multimodal architecture called the coherence recurrent convolutional network (CRCN), which consists of convolutional neural networks, bidirectional recurrent neural networks, and an entity-based local coherence model. Our approach learns directly from the vast user-generated resource of blog posts as text-image parallel training data. We demonstrate that our approach outperforms other state-of-the-art candidate methods, using both quantitative measures (e.g., BLEU and top-K recall) and user studies via Amazon Mechanical Turk.
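The top-K recall measure named in the abstract can be illustrated with a minimal sketch. It assumes the common retrieval definition: for each query, candidates are ranked, and the query counts as a hit if its ground-truth item appears among the first K. The function name `recall_at_k` and the toy data are hypothetical and are not taken from the paper, whose exact evaluation protocol may differ.

```python
from typing import Sequence


def recall_at_k(
    ranked_candidates: Sequence[Sequence[int]],
    ground_truth: Sequence[int],
    k: int,
) -> float:
    """Fraction of queries whose ground-truth id is in the top-k ranking.

    ranked_candidates[i] is the candidate id list for query i, best first.
    ground_truth[i] is the single correct candidate id for query i.
    (Hypothetical helper for illustration; not the paper's code.)
    """
    if len(ranked_candidates) != len(ground_truth):
        raise ValueError("one ranking is required per ground-truth item")
    if k <= 0:
        raise ValueError("k must be positive")
    if not ground_truth:
        return 0.0
    hits = sum(
        1
        for ranking, truth in zip(ranked_candidates, ground_truth)
        if truth in ranking[:k]
    )
    return hits / len(ground_truth)


if __name__ == "__main__":
    # Toy data (assumed): three queries, each ranking three candidate ids.
    rankings = [[3, 1, 2], [0, 2, 1], [2, 0, 1]]
    truth = [1, 1, 2]
    for k in (1, 2, 3):
        print(f"recall@{k} = {recall_at_k(rankings, truth, k):.3f}")
```

With the toy data, one of three queries is a hit at K=1, two at K=2, and all three at K=3, so the measure rises monotonically with K.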
Similar resources
Construction of an Expression Vector Containing a Novel Fusion Sequence from Middle Region of NS3 and Truncated Core Genes of Hepatitis C Virus
Background and Aims: DNA constructs containing HCV antigens have become one of the vaccine candidates for induction of anti-HCV cellular and humoral immunity. In this study, we constructed a novel expression vector harboring a fusion sequence derived from an overlapping fragment in the middle of NS3 and a truncated core fragment to avoid troubles reported to be associated with full gene express...
Image Encryption by Using Combination of DNA Sequence and Lattice Map
In recent years, the advancement of digital technology has led to an increase in data transmission on the Internet. Security of images is one of the biggest concerns of many researchers. Therefore, numerous algorithms have been presented for image encryption. An efficient encryption algorithm should have high security and low search time along with high complexity. DNA encryption is one of the fa...
Split and Rephrase
We propose a new sentence simplification task (Split-and-Rephrase) where the aim is to split a complex sentence into a meaning preserving sequence of shorter sentences. Like sentence simplification, splitting-and-rephrasing has the potential of benefiting both natural language processing and societal applications. Because shorter sentences are generally better processed by NLP systems, it could...
Camera Pose Estimation in Unknown Environments using a Sequence of Wide-Baseline Monocular Images
In this paper, a feature-based technique for camera pose estimation in a sequence of wide-baseline images is proposed. Camera pose estimation is an important issue in many computer vision and robotics applications, such as augmented reality and visual SLAM. The proposed method can track captured images taken by a hand-held camera in room-sized workspaces with maximum scene depth of 3-4...
Investigating the Narrative Skills of Late Talkers Through Sequential Picture Stories
Objectives: The purpose of the present study is to investigate the oral narrative skills of late talkers, mostly caused by mental disorders, while they try to comprehend a wordless sequential picture story in order to create and narrate the relevant story. Methods: To this end, 15 participants (10 male and 5 female) who were students of a specialized school for physically and mentally retarded...